28 research outputs found

    Classic machine learning methods

    In this chapter, we present the main classic machine learning methods. A large part of the chapter is devoted to supervised learning techniques for classification and regression, including nearest-neighbor methods, linear and logistic regressions, support vector machines and tree-based algorithms. We also describe the problem of overfitting as well as strategies to overcome it. We finally provide a brief overview of unsupervised learning methods, namely for clustering and dimensionality reduction.

    Classification de séries temporelles : une revue d'algorithmes et d'implémentations (Time series classification: a review of algorithms and implementations)

    Time series classification is a subfield of machine learning with numerous real-life applications. Due to the temporal structure of the input data, standard machine learning algorithms are usually not well suited to work on raw time series. Over the last decades, many algorithms have been proposed to improve the predictive performance and the scalability of state-of-the-art models. Many approaches have been investigated, ranging from deriving new metrics to developing bag-of-words models to imaging time series to artificial neural networks. In this review, we present in detail the major contributions made to this field and mention their most prominent extensions. We dedicate a section to each category of algorithms, with an intuitive introduction to the general approach, detailed theoretical descriptions, and explicit illustrations of the major contributions. Finally, we dedicate a section to publicly available resources for time series classification, namely data sets and open-source software. Particular emphasis is placed on listing which of the mentioned algorithms are available in the most popular libraries. The combination of theoretical and practical content provided in this review will help readers get started on their own work on time series classification, whether theoretical or practical.

    pyts: A Python Package for Time Series Classification

    pyts is an open-source Python package for time series classification. This versatile toolbox provides implementations of many algorithms published in the literature, preprocessing functionalities, and data set loading utilities. pyts relies on the standard scientific Python packages numpy, scipy, scikit-learn, joblib, and numba, and is distributed under the BSD-3-Clause license. Documentation contains installation instructions, a detailed user guide, a full API description, and concrete self-contained examples. Source code and documentation can be downloaded from https://github.com/johannfaouzi/pyts.

    Deep learning for brain disorders: from data processing to disease treatment

    Machine learning is increasingly used in medicine in order to achieve precision medicine and improve patients' quality of life. Brain disorders are often complex and heterogeneous, and several modalities such as demographic, clinical, imaging, genetic and environmental data have been studied to improve their understanding. Deep learning, a subfield of machine learning, provides complex algorithms that can learn from such varied data. It has become state of the art in numerous fields, including computer vision and natural language processing, and is also increasingly applied in medicine. In this article, we review the use of deep learning for brain disorders. More specifically, we identify the main applications, the disorders concerned, and the types of architectures and data used. Finally, we provide guidelines to bridge the gap between research studies and clinical routine.

    Multivariate classification provides a neural signature of Tourette disorder

    Background: Tourette disorder (TD), hallmarks of which are motor and vocal tics, has been related to functional abnormalities in large-scale brain networks. Using a fully data-driven approach in a prospective, case-control study, we tested the hypothesis that functional connectivity of these networks carries a neural signature of TD. Our aim was to investigate (i) the brain networks that distinguish adult patients with TD from controls, and (ii) the effects of antipsychotic medication on these networks. Methods: Using a multivariate analysis based on support vector machine (SVM), we developed a predictive model of resting state functional connectivity in 48 patients and 51 controls, and identified brain networks that were most affected by disease and pharmacological treatments. We also performed standard univariate analyses to identify differences in specific connections across groups. Results: SVM was able to identify TD with 67% accuracy (p=0.004), based on the connectivity in widespread networks involving the striatum, fronto-parietal cortical areas and the cerebellum. Medicated and unmedicated patients were discriminated with 69% accuracy (p=0.019), based on the connectivity among striatum, insular and cerebellar networks. Univariate approaches revealed differences in functional connectivity within the striatum in patients vs. controls, and between the caudate and insular cortex in medicated vs. unmedicated TD. Conclusions: SVM was able to identify a neuronal network that distinguishes patients with TD from controls, as well as medicated from unmedicated patients with TD, holding promise for identifying imaging-based biomarkers of TD for clinical use and for evaluating the effects of treatment.

    Predicting the Progression of Mild Cognitive Impairment Using Machine Learning: A Systematic and Quantitative Review

    Context. Automatically predicting whether a subject with Mild Cognitive Impairment (MCI) is going to progress to Alzheimer's disease (AD) dementia in the coming years is a relevant question for clinical practice and trial inclusion alike. A large number of articles have been published, with a wide range of algorithms, input variables, data sets and experimental designs. It is unclear which of these factors are determinant for the prediction and affect the predictive performance that can be expected in clinical practice. We performed a systematic review of studies focusing on the automatic prediction of the progression of MCI to AD dementia. We systematically and statistically studied the influence of different factors on predictive performance. Method. The review included 172 articles, 93 of which were published after 2014. 234 experiments were extracted from these articles. For each of them, we reported the data set used, the feature types (defining 10 categories), the algorithm type (defining 12 categories), performance and potential methodological issues. The impact of the features and algorithm on the performance was evaluated using t-tests on the coefficients of mixed effect linear regressions. Results. We found that using cognitive, fluorodeoxyglucose-positron emission tomography or potentially electroencephalography and magnetoencephalography variables significantly improves predictive performance compared to not including them (p=0.046, 0.009 and 0.003 respectively), whereas including T1 magnetic resonance imaging, amyloid positron emission tomography or cerebrospinal fluid AD biomarkers does not show a significant effect. On the other hand, the algorithm used in the method does not have a significant impact on performance. We identified several methodological issues. Major issues, found in 23.5% of studies, include the absence of a test set, or its use for feature selection or parameter tuning. Other issues, found in 15.0% of studies, pertain to the usability of the method in clinical practice. We also highlight that short-term predictions are likely not to be better than predicting that subjects stay stable over time. Finally, we highlight possible biases in publications that tend not to publish methods with poor performance on large data sets, which may be censored as negative results. Conclusion. Using machine learning to predict MCI to AD dementia progression is a promising and dynamic field. Among the most predictive modalities, cognitive scores are the cheapest and least invasive, as compared to imaging. The good performance they offer questions the wide use of imaging for predicting diagnosis evolution, and calls for further exploration of fine cognitive assessments. Issues identified in the studies highlight the importance of establishing good practices and guidelines for the use of machine learning as a decision support system in clinical practice.

    Association between the LRP1B and APOE loci in the development of Parkinson’s disease dementia

    Parkinson’s disease is one of the most common age-related neurodegenerative disorders. Although predominantly a motor disorder, cognitive impairment and dementia are important features of Parkinson’s disease, particularly in the later stages of the disease. However, the rate of cognitive decline varies among Parkinson’s disease patients, and the genetic basis for this heterogeneity is incompletely understood. To explore the genetic factors associated with rate of progression to Parkinson’s disease dementia, we performed a genome-wide survival meta-analysis of 3923 clinically diagnosed Parkinson’s disease cases of European ancestry from four longitudinal cohorts. In total, 6.7% of individuals with Parkinson’s disease developed dementia during study follow-up, on average 4.4 ± 2.4 years from disease diagnosis. We have identified the APOE ε4 allele as a major risk factor for the conversion to Parkinson’s disease dementia [hazard ratio = 2.41 (1.94–3.00), P = 2.32 × 10⁻¹⁵], as well as a new locus within the ApoE and APP receptor LRP1B gene [hazard ratio = 3.23 (2.17–4.81), P = 7.07 × 10⁻⁹]. In a candidate gene analysis, GBA variants were also identified to be associated with higher risk of progression to dementia [hazard ratio = 2.02 (1.21–3.32), P = 0.007]. CSF biomarker analysis also implicated the amyloid pathway in Parkinson’s disease dementia, with significantly reduced levels of amyloid β42 (P = 0.0012) in Parkinson’s disease dementia compared to Parkinson’s disease without dementia. These results identify a new candidate gene associated with faster conversion to dementia in Parkinson’s disease and suggest that amyloid-targeting therapy may have a role in preventing Parkinson’s disease dementia.

    Apprentissage automatique pour la prédiction des troubles du contrôle des impulsions dans la maladie de Parkinson (Machine learning for the prediction of impulse control disorders in Parkinson's disease)

    Impulse control disorders are a class of psychiatric disorders characterized by impulsivity, that is, by difficulties in controlling one's emotions, thoughts and behaviors. These disorders are common during the course of Parkinson's disease, decrease the quality of life of subjects, and increase caregiver burden. Being able to predict which individuals are at higher risk of developing these disorders, and when, is of high importance. This thesis studies impulse control disorders in Parkinson's disease from the statistical and machine learning points of view and is divided into two parts. The first part investigates the predictive performance of all the factors associated with these disorders in the literature, taken together. The second part studies the association and the usefulness of other factors, in particular genetic data, to improve the predictive performance.


    Classic machine learning algorithms

    In this chapter, we present the main classic machine learning algorithms. A large part of the chapter is devoted to supervised learning algorithms for classification and regression, including nearest-neighbor methods, linear and logistic regressions, support vector machines and tree-based algorithms. We also describe the problem of overfitting as well as strategies to overcome it. We finally provide a brief overview of unsupervised learning methods, namely for clustering and dimensionality reduction. The chapter does not cover neural networks and deep learning.